Overview

Dataset statistics

Number of variables: 9
Number of observations: 4177
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 497.8 KiB
Average record size in memory: 122.0 B

Variable types

Categorical: 1
Numeric: 8

Alerts

Length is highly correlated with Diameter and 6 other fieldsHigh correlation
Diameter is highly correlated with Length and 6 other fieldsHigh correlation
Height is highly correlated with Length and 6 other fieldsHigh correlation
Whole weight is highly correlated with Length and 6 other fieldsHigh correlation
Shucked weight is highly correlated with Length and 6 other fieldsHigh correlation
Viscera weight is highly correlated with Length and 6 other fieldsHigh correlation
Shell weight is highly correlated with Length and 6 other fieldsHigh correlation
Rings is highly correlated with Length and 6 other fieldsHigh correlation
Length is highly correlated with Diameter and 6 other fieldsHigh correlation
Diameter is highly correlated with Length and 6 other fieldsHigh correlation
Height is highly correlated with Length and 6 other fieldsHigh correlation
Whole weight is highly correlated with Length and 6 other fieldsHigh correlation
Shucked weight is highly correlated with Length and 5 other fieldsHigh correlation
Viscera weight is highly correlated with Length and 6 other fieldsHigh correlation
Shell weight is highly correlated with Length and 6 other fieldsHigh correlation
Rings is highly correlated with Length and 5 other fieldsHigh correlation
Length is highly correlated with Diameter and 5 other fieldsHigh correlation
Diameter is highly correlated with Length and 5 other fieldsHigh correlation
Height is highly correlated with Length and 6 other fieldsHigh correlation
Whole weight is highly correlated with Length and 5 other fieldsHigh correlation
Shucked weight is highly correlated with Length and 5 other fieldsHigh correlation
Viscera weight is highly correlated with Length and 5 other fieldsHigh correlation
Shell weight is highly correlated with Length and 6 other fieldsHigh correlation
Rings is highly correlated with Height and 1 other fieldsHigh correlation
Sex is highly correlated with Length and 6 other fieldsHigh correlation
Length is highly correlated with Sex and 7 other fieldsHigh correlation
Diameter is highly correlated with Sex and 7 other fieldsHigh correlation
Height is highly correlated with Length and 6 other fieldsHigh correlation
Whole weight is highly correlated with Sex and 7 other fieldsHigh correlation
Shucked weight is highly correlated with Sex and 7 other fieldsHigh correlation
Viscera weight is highly correlated with Sex and 7 other fieldsHigh correlation
Shell weight is highly correlated with Sex and 7 other fieldsHigh correlation
Rings is highly correlated with Sex and 7 other fieldsHigh correlation

Reproduction

Analysis started: 2022-02-15 20:03:38.678187
Analysis finished: 2022-02-15 20:03:46.276083
Duration: 7.6 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Sex
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 236.7 KiB
M: 1528
I: 1342
F: 1307

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: F
4th row: M
5th row: I

Common Values

Value   Count   Frequency (%)
M       1528    36.6%
I       1342    32.1%
F       1307    31.3%

Length

Overview

Dataset statistics

Number of variables: 8
Number of observations: 4175
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 175.4 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 5
Categorical: 3

Alerts

Length is highly correlated with Height and 3 other fieldsHigh correlation
Height is highly correlated with Length and 3 other fieldsHigh correlation
Whole weight is highly correlated with Length and 3 other fieldsHigh correlation
Age is highly correlated with Length and 3 other fieldsHigh correlation
Sex_F is highly correlated with Sex_MHigh correlation
Sex_I is highly correlated with Length and 4 other fieldsHigh correlation
Sex_M is highly correlated with Sex_F and 1 other fieldsHigh correlation
Length is highly correlated with Height and 3 other fieldsHigh correlation
Height is highly correlated with Length and 3 other fieldsHigh correlation
Whole weight is highly correlated with Length and 3 other fieldsHigh correlation
Age is highly correlated with Length and 2 other fieldsHigh correlation
Sex_F is highly correlated with Sex_MHigh correlation
Sex_I is highly correlated with Length and 3 other fieldsHigh correlation
Sex_M is highly correlated with Sex_F and 1 other fieldsHigh correlation
Length is highly correlated with Height and 1 other fieldsHigh correlation
Height is highly correlated with Length and 2 other fieldsHigh correlation
Whole weight is highly correlated with Length and 1 other fieldsHigh correlation
Age is highly correlated with HeightHigh correlation
Sex_F is highly correlated with Sex_MHigh correlation
Sex_I is highly correlated with Sex_MHigh correlation
Sex_M is highly correlated with Sex_F and 1 other fieldsHigh correlation
Sex_F is highly correlated with Sex_MHigh correlation
Sex_I is highly correlated with Sex_MHigh correlation
Sex_M is highly correlated with Sex_F and 1 other fieldsHigh correlation
Length is highly correlated with Height and 3 other fieldsHigh correlation
Height is highly correlated with Length and 2 other fieldsHigh correlation
Whole weight is highly correlated with Length and 3 other fieldsHigh correlation
Age is highly correlated with Length and 3 other fieldsHigh correlation
Sex_F is highly correlated with Sex_I and 1 other fieldsHigh correlation
Sex_I is highly correlated with Length and 4 other fieldsHigh correlation
Sex_M is highly correlated with Sex_F and 1 other fieldsHigh correlation
df_index is uniformly distributed Uniform
df_index has unique values Unique

Reproduction

Analysis started: 2022-02-15 20:43:21.103133
Analysis finished: 2022-02-15 20:43:26.030583
Duration: 4.93 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct4175
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2087.742036
Minimum0
Maximum4176
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size32.7 KiB

Quantile statistics

Minimum0
5-th percentile208.7
Q11043.5
median2088
Q33131.5
95-th percentile3966.3
Maximum4176
Range4176
Interquartile range (IQR)2088

Descriptive statistics

Standard deviation1205.799036
Coefficient of variation (CV)0.577561315
Kurtosis-1.199849531
Mean2087.742036
Median Absolute Deviation (MAD)1044
Skewness-0.0002290815981
Sum8716323
Variance1453951.314
MonotonicityStrictly increasing
Histogram of lengths of the category

Pie chart

Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01
 
< 0.1%
13341
 
< 0.1%
13381
 
< 0.1%
33871
 
< 0.1%
13421
 
< 0.1%
33911
 
< 0.1%
13461
 
< 0.1%
33951
 
< 0.1%
13501
 
< 0.1%
33991
 
< 0.1%
Other values (4165)4165
99.8%
ValueCountFrequency (%)
01
< 0.1%
11
< 0.1%
21
< 0.1%
31
< 0.1%
41
< 0.1%
51
< 0.1%
61
< 0.1%
71
< 0.1%
81
< 0.1%
91
< 0.1%
ValueCountFrequency (%)
41761
< 0.1%
41751
< 0.1%
41741
< 0.1%
41731
< 0.1%
41721
< 0.1%
41711
< 0.1%
41701
< 0.1%
41691
< 0.1%
41681
< 0.1%
41671
< 0.1%

Length
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct134
Distinct (%)3.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.7184580322
Minimum0.2738612788
Maximum0.9027735043
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.7 KiB
ValueCountFrequency (%)
m1528
36.6%
i1342
32.1%
f1307
31.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Length
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct134
Distinct (%)3.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.5239920996
Minimum0.075
Maximum0.815
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB

Quantile statistics

Minimum0.075
5-th percentile0.295
Q10.45
median0.545
Q30.615
95-th percentile0.69
Maximum0.815
Range0.74
Interquartile range (IQR)0.165

Descriptive statistics

Standard deviation0.1200929126
Coefficient of variation (CV)0.2291884031
Kurtosis0.06462097389
Mean0.5239920996
Median Absolute Deviation (MAD)0.08
Skewness-0.639873269
Sum2188.715
Variance0.01442230765
MonotonicityNot monotonic
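
Several of the figures in this summary are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The short sketch below recomputes them from the values listed for Length (minimum 0.075, maximum 0.815). The variable names are illustrative only and are not part of the report.

```python
# Values copied from the Length summary (minimum 0.075, maximum 0.815).
minimum, maximum = 0.075, 0.815
q1, q3 = 0.45, 0.615
mean, std = 0.5239920996, 0.1200929126

# Derived statistics, recomputed from the reported values.
value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(f"range={value_range:.3f} iqr={iqr:.3f} cv={cv:.6f} variance={variance:.8f}")
```

Small differences in the last digits are possible because the report rounds the mean and standard deviation before printing them.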

Quantile statistics

Minimum0.2738612788
5-th percentile0.5431390246
Q10.6708203932
median0.738241153
Q30.7842193571
95-th percentile0.8306623863
Maximum0.9027735043
Range0.6289122255
Interquartile range (IQR)0.1133989638

Descriptive statistics

Standard deviation0.08879535514
Coefficient of variation (CV)0.1235915685
Kurtosis1.104825505
Mean0.7184580322
Median Absolute Deviation (MAD)0.05267569297
Skewness-1.022225458
Sum2999.562285
Variance0.007884615094
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.62594
 
2.3%
0.5594
 
2.3%
0.57593
 
2.2%
0.5892
 
2.2%
0.687
 
2.1%
0.6287
 
2.1%
0.581
 
1.9%
0.5779
 
1.9%
0.6378
 
1.9%
0.6175
 
1.8%
Other values (124)3317
79.4%
ValueCountFrequency (%)
0.0751
 
< 0.1%
0.111
 
< 0.1%
0.132
 
< 0.1%
0.1351
 
< 0.1%
0.142
 
< 0.1%
0.151
 
< 0.1%
0.1553
0.1%
0.164
0.1%
0.1655
0.1%
0.173
0.1%
ValueCountFrequency (%)
0.8151
 
< 0.1%
0.81
 
< 0.1%
0.782
 
< 0.1%
0.7752
 
< 0.1%
0.773
 
0.1%
0.7652
 
< 0.1%
0.762
 
< 0.1%
0.7553
 
0.1%
0.758
0.2%
0.7455
0.1%

Diameter
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct111
Distinct (%)2.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.4078812545
Minimum0.055
Maximum0.65
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.741619848794
 
2.3%
0.79056941594
 
2.3%
0.758287544493
 
2.2%
0.761577310692
 
2.2%
0.774596669287
 
2.1%
0.787400787487
 
2.1%
0.707106781281
 
1.9%
0.754983443579
 
1.9%
0.793725393378
 
1.9%
0.781024967675
 
1.8%
Other values (124)3315
79.4%
ValueCountFrequency (%)
0.27386127881
 
< 0.1%
0.3316624791
 
< 0.1%
0.36055512752
 
< 0.1%
0.36742346141
 
< 0.1%
0.37416573872
 
< 0.1%
0.38729833461
 
< 0.1%
0.39370039373
0.1%
0.44
0.1%
0.40620192025
0.1%
0.41231056263
0.1%
ValueCountFrequency (%)
0.90277350431
 
< 0.1%
0.8944271911
 
< 0.1%
0.88317608662
 
< 0.1%
0.88034084312
 
< 0.1%
0.87749643873
 
0.1%
0.87464278422
 
< 0.1%
0.87177978872
 
< 0.1%
0.86890735983
 
0.1%
0.86602540388
0.2%
0.86313382515
0.1%

Height
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct50
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.3694087725
Minimum0.1
Maximum1.063014581
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.7 KiB

Quantile statistics

Minimum0.055
5-th percentile0.22
Q10.35
median0.425
Q30.48
95-th percentile0.545
Maximum0.65
Range0.595
Interquartile range (IQR)0.13

Descriptive statistics

Standard deviation0.09923986613
Coefficient of variation (CV)0.2433057784
Kurtosis-0.04547558144
Mean0.4078812545
Median Absolute Deviation (MAD)0.065
Skewness-0.6091981423
Sum1703.72
Variance0.00984855103
MonotonicityNot monotonic

Quantile statistics

Minimum0.1
5-th percentile0.2738612788
Q10.3391164992
median0.3741657387
Q30.4062019202
95-th percentile0.4472135955
Maximum1.063014581
Range0.9630145813
Interquartile range (IQR)0.06708542108

Descriptive statistics

Standard deviation0.05586716309
Coefficient of variation (CV)0.1512339913
Kurtosis6.496555339
Mean0.3694087725
Median Absolute Deviation (MAD)0.03504923952
Skewness-0.1760118468
Sum1542.281625
Variance0.003121139912
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.45139
 
3.3%
0.475120
 
2.9%
0.4111
 
2.7%
0.5110
 
2.6%
0.47100
 
2.4%
0.4891
 
2.2%
0.45590
 
2.2%
0.4689
 
2.1%
0.4487
 
2.1%
0.48583
 
2.0%
Other values (101)3157
75.6%
ValueCountFrequency (%)
0.0551
 
< 0.1%
0.091
 
< 0.1%
0.0951
 
< 0.1%
0.12
 
< 0.1%
0.1054
0.1%
0.114
0.1%
0.1152
 
< 0.1%
0.125
0.1%
0.1257
0.2%
0.138
0.2%
ValueCountFrequency (%)
0.651
 
< 0.1%
0.633
 
0.1%
0.6251
 
< 0.1%
0.621
 
< 0.1%
0.6151
 
< 0.1%
0.611
 
< 0.1%
0.6053
 
0.1%
0.68
0.2%
0.5954
0.1%
0.596
0.1%

Height
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct51
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1395163993
Minimum0
Maximum1.13
Zeros2
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size32.8 KiB
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.3872983346267
 
6.4%
0.3741657387220
 
5.3%
0.3937003937217
 
5.2%
0.4183300133211
 
5.1%
0.4205
 
4.9%
0.3535533906202
 
4.8%
0.4062019202193
 
4.6%
0.3674234614189
 
4.5%
0.3807886553182
 
4.4%
0.3464101615169
 
4.0%
Other values (40)2120
50.8%
ValueCountFrequency (%)
0.11
 
< 0.1%
0.12247448712
 
< 0.1%
0.14142135622
 
< 0.1%
0.1581138835
 
0.1%
0.17320508086
 
0.1%
0.18708286936
 
0.1%
0.213
0.3%
0.212132034411
0.3%
0.223606797718
0.4%
0.23452078825
0.6%
ValueCountFrequency (%)
1.0630145811
 
< 0.1%
0.71763500471
 
< 0.1%
0.53
 
0.1%
0.48989794864
 
0.1%
0.48476798576
 
0.1%
0.479583152310
 
0.2%
0.47434164913
0.3%
0.46904157617
0.4%
0.463680924831
0.7%
0.458257569523
0.6%

Whole weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2427
Distinct (%)58.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8290045509
Minimum0.002
Maximum2.8255
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.7 KiB

Quantile statistics

Minimum0
5-th percentile0.075
Q10.115
median0.14
Q30.165
95-th percentile0.2
Maximum1.13
Range1.13
Interquartile range (IQR)0.05

Descriptive statistics

Standard deviation0.04182705661
Coefficient of variation (CV)0.2998002873
Kurtosis76.02550923
Mean0.1395163993
Median Absolute Deviation (MAD)0.025
Skewness3.128817379
Sum582.76
Variance0.001749502664
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.15267
 
6.4%
0.14220
 
5.3%
0.155217
 
5.2%
0.175211
 
5.1%
0.16205
 
4.9%
0.125202
 
4.8%
0.165193
 
4.6%
0.135189
 
4.5%
0.145182
 
4.4%
0.12169
 
4.0%
Other values (41)2122
50.8%
ValueCountFrequency (%)
02
 
< 0.1%
0.011
 
< 0.1%
0.0152
 
< 0.1%
0.022
 
< 0.1%
0.0255
 
0.1%
0.036
 
0.1%
0.0356
 
0.1%
0.0413
0.3%
0.04511
0.3%
0.0518
0.4%
ValueCountFrequency (%)
1.131
 
< 0.1%
0.5151
 
< 0.1%
0.253
 
0.1%
0.244
 
0.1%
0.2356
 
0.1%
0.2310
 
0.2%
0.22513
0.3%
0.2217
0.4%
0.21531
0.7%
0.2123
0.6%

Whole weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2429
Distinct (%)58.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8287421594
Minimum0.002
Maximum2.8255
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB

Quantile statistics

Minimum0.002
5-th percentile0.12585
Q10.44225
median0.8
Q31.1535
95-th percentile1.6956
Maximum2.8255
Range2.8235
Interquartile range (IQR)0.71125

Descriptive statistics

Standard deviation0.4903493012
Coefficient of variation (CV)0.5914916881
Kurtosis-0.02346199245
Mean0.8290045509
Median Absolute Deviation (MAD)0.356
Skewness0.5305486493
Sum3461.094
Variance0.2404424372
MonotonicityNot monotonic

Quantile statistics

Minimum0.002
5-th percentile0.1259
Q10.4415
median0.7995
Q31.153
95-th percentile1.6949
Maximum2.8255
Range2.8235
Interquartile range (IQR)0.7115

Descriptive statistics

Standard deviation0.4903890182
Coefficient of variation (CV)0.5917268871
Kurtosis-0.02364350427
Mean0.8287421594
Median Absolute Deviation (MAD)0.3565
Skewness0.5309585633
Sum3461.656
Variance0.2404813892
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.22258
 
0.2%
0.47757
 
0.2%
0.1967
 
0.2%
0.977
 
0.2%
1.13457
 
0.2%
0.186
 
0.1%
0.67656
 
0.1%
0.4946
 
0.1%
0.32456
 
0.1%
0.58056
 
0.1%
Other values (2417)4109
98.4%
ValueCountFrequency (%)
0.0021
< 0.1%
0.0081
< 0.1%
0.01051
< 0.1%
0.0131
< 0.1%
0.0141
< 0.1%
0.01452
< 0.1%
0.0151
< 0.1%
0.01551
< 0.1%
0.01751
< 0.1%
0.0182
< 0.1%
ValueCountFrequency (%)
2.82551
< 0.1%
2.77951
< 0.1%
2.6571
< 0.1%
2.5551
< 0.1%
2.551
< 0.1%
2.5481
< 0.1%
2.5261
< 0.1%
2.51551
< 0.1%
2.50851
< 0.1%
2.5051
< 0.1%

Age
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct28
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11.43508982
Minimum2.5
Maximum30.5
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.7 KiB

Quantile statistics

Minimum2.5
5-th percentile7.5
Q19.5
median10.5
Q312.5
95-th percentile17.5
Maximum30.5
Range28
Interquartile range (IQR)3

Descriptive statistics

Standard deviation3.224227336
Coefficient of variation (CV)0.2819590739
Kurtosis2.330351976
Mean11.43508982
Median Absolute Deviation (MAD)2
Skewness1.113754261
Sum47741.5
Variance10.39564191
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.22258
 
0.2%
0.1967
 
0.2%
0.47757
 
0.2%
0.977
 
0.2%
1.13457
 
0.2%
0.186
 
0.1%
0.67656
 
0.1%
0.58056
 
0.1%
0.32456
 
0.1%
0.4946
 
0.1%
Other values (2419)4111
98.4%
ValueCountFrequency (%)
0.0021
< 0.1%
0.0081
< 0.1%
0.01051
< 0.1%
0.0131
< 0.1%
0.0141
< 0.1%
0.01452
< 0.1%
0.0151
< 0.1%
0.01551
< 0.1%
0.01751
< 0.1%
0.0182
< 0.1%
ValueCountFrequency (%)
2.82551
< 0.1%
2.77951
< 0.1%
2.6571
< 0.1%
2.5551
< 0.1%
2.551
< 0.1%
2.5481
< 0.1%
2.5261
< 0.1%
2.51551
< 0.1%
2.50851
< 0.1%
2.5051
< 0.1%

Shucked weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct1515
Distinct (%)36.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.3593674886
Minimum0.001
Maximum1.488
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB
Histogram with fixed size bins (bins=28)
ValueCountFrequency (%)
10.5689
16.5%
11.5634
15.2%
9.5567
13.6%
12.5487
11.7%
8.5391
9.4%
13.5267
 
6.4%
7.5258
 
6.2%
14.5203
 
4.9%
15.5126
 
3.0%
6.5115
 
2.8%
Other values (18)438
10.5%
ValueCountFrequency (%)
2.51
 
< 0.1%
3.51
 
< 0.1%
4.515
 
0.4%
5.557
 
1.4%
6.5115
 
2.8%
7.5258
 
6.2%
8.5391
9.4%
9.5567
13.6%
10.5689
16.5%
11.5634
15.2%
ValueCountFrequency (%)
30.51
 
< 0.1%
28.52
 
< 0.1%
27.51
 
< 0.1%
26.51
 
< 0.1%
25.52
 
< 0.1%
24.59
 
0.2%
23.56
 
0.1%
22.514
0.3%
21.526
0.6%
20.532
0.8%

Sex_F
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size236.6 KiB
0
2868 
1
1307 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row0
5th row0

Common Values

ValueCountFrequency (%)
02868
68.7%
11307
31.3%

Length

Quantile statistics

Minimum0.001
5-th percentile0.0524
Q10.186
median0.336
Q30.502
95-th percentile0.7402
Maximum1.488
Range1.487
Interquartile range (IQR)0.316

Descriptive statistics

Standard deviation0.221962949
Coefficient of variation (CV)0.6176489417
Kurtosis0.5951236784
Mean0.3593674886
Median Absolute Deviation (MAD)0.1585
Skewness0.7190979218
Sum1501.078
Variance0.04926755074
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.17511
 
0.3%
0.250510
 
0.2%
0.1659
 
0.2%
0.0979
 
0.2%
0.219
 
0.2%
0.4199
 
0.2%
0.3029
 
0.2%
0.0969
 
0.2%
0.20259
 
0.2%
0.29459
 
0.2%
Other values (1505)4084
97.8%
ValueCountFrequency (%)
0.0011
 
< 0.1%
0.00251
 
< 0.1%
0.00452
< 0.1%
0.0053
0.1%
0.00552
< 0.1%
0.00653
0.1%
0.0071
 
< 0.1%
0.00754
0.1%
0.0081
 
< 0.1%
0.00851
 
< 0.1%
ValueCountFrequency (%)
1.4881
< 0.1%
1.3511
< 0.1%
1.34851
< 0.1%
1.2531
< 0.1%
1.24551
< 0.1%
1.23952
< 0.1%
1.2321
< 0.1%
1.19651
< 0.1%
1.19451
< 0.1%
1.17051
< 0.1%

Viscera weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct880
Distinct (%)21.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1805936079
Minimum0.0005
Maximum0.76
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB

Quantile statistics

Minimum0.0005
5-th percentile0.027
Q10.0935
median0.171
Q30.253
95-th percentile0.3796
Maximum0.76
Range0.7595
Interquartile range (IQR)0.1595

Descriptive statistics

Standard deviation0.1096142503
Coefficient of variation (CV)0.6069663902
Kurtosis0.084011749
Mean0.1805936079
Median Absolute Deviation (MAD)0.0795
Skewness0.5918521514
Sum754.3395
Variance0.01201528386
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.171515
 
0.4%
0.19614
 
0.3%
0.03713
 
0.3%
0.06113
 
0.3%
0.057513
 
0.3%
0.219513
 
0.3%
0.15612
 
0.3%
0.09612
 
0.3%
0.026512
 
0.3%
0.162512
 
0.3%
Other values (870)4048
96.9%
ValueCountFrequency (%)
0.00052
 
< 0.1%
0.0021
 
< 0.1%
0.00252
 
< 0.1%
0.0033
0.1%
0.00353
0.1%
0.0041
 
< 0.1%
0.00454
0.1%
0.0057
0.2%
0.00556
0.1%
0.0062
 
< 0.1%
ValueCountFrequency (%)
0.761
< 0.1%
0.64151
< 0.1%
0.591
< 0.1%
0.5751
< 0.1%
0.57451
< 0.1%
0.5641
< 0.1%
0.551
< 0.1%
0.5412
< 0.1%
0.52651
< 0.1%
0.5261
< 0.1%

Shell weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct926
Distinct (%)22.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.2388308595
Minimum0.0015
Maximum1.005
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB

Quantile statistics

Minimum0.0015
5-th percentile0.0384
Q10.13
median0.234
Q30.329
95-th percentile0.48
Maximum1.005
Range1.0035
Interquartile range (IQR)0.199

Descriptive statistics

Standard deviation0.1392026695
Coefficient of variation (CV)0.5828504316
Kurtosis0.5319261262
Mean0.2388308595
Median Absolute Deviation (MAD)0.0995
Skewness0.6209268251
Sum997.5965
Variance0.0193773832
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.27543
 
1.0%
0.2542
 
1.0%
0.31540
 
1.0%
0.26540
 
1.0%
0.18540
 
1.0%
0.1737
 
0.9%
0.28537
 
0.9%
0.17536
 
0.9%
0.336
 
0.9%
0.2236
 
0.9%
Other values (916)3790
90.7%
ValueCountFrequency (%)
0.00151
 
< 0.1%
0.0031
 
< 0.1%
0.00351
 
< 0.1%
0.0042
 
< 0.1%
0.00512
0.3%
0.0061
 
< 0.1%
0.00651
 
< 0.1%
0.0071
 
< 0.1%
0.00751
 
< 0.1%
0.0084
 
0.1%
ValueCountFrequency (%)
1.0051
 
< 0.1%
0.8971
 
< 0.1%
0.8852
< 0.1%
0.851
 
< 0.1%
0.8151
 
< 0.1%
0.79751
 
< 0.1%
0.781
 
< 0.1%
0.761
 
< 0.1%
0.7261
 
< 0.1%
0.7253
0.1%

Rings
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct28
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean9.933684463
Minimum1
Maximum29
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size32.8 KiB

Quantile statistics

Minimum1
5-th percentile6
Q18
median9
Q311
95-th percentile16
Maximum29
Range28
Interquartile range (IQR)3

Descriptive statistics

Standard deviation3.224169032
Coefficient of variation (CV)0.324569302
Kurtosis2.330687427
Mean9.933684463
Median Absolute Deviation (MAD)2
Skewness1.114101898
Sum41493
Variance10.39526595
MonotonicityNot monotonic
Histogram with fixed size bins (bins=28)
ValueCountFrequency (%)
9689
16.5%
10634
15.2%
8568
13.6%
11487
11.7%
7391
9.4%
12267
 
6.4%
6259
 
6.2%
13203
 
4.9%
14126
 
3.0%
5115
 
2.8%
Other values (18)438
10.5%
ValueCountFrequency (%)
11
 
< 0.1%
21
 
< 0.1%
315
 
0.4%
457
 
1.4%
5115
 
2.8%
6259
 
6.2%
7391
9.4%
8568
13.6%
9689
16.5%
10634
15.2%
ValueCountFrequency (%)
291
 
< 0.1%
272
 
< 0.1%
261
 
< 0.1%
251
 
< 0.1%
242
 
< 0.1%
239
 
0.2%
226
 
0.1%
2114
0.3%
2026
0.6%
1932
0.8%

Interactions

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
02868
68.7%
11307
31.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Sex_I
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size236.6 KiB
0
2835 
1
1340 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
02835
67.9%
11340
32.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
02835
67.9%
11340
32.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Sex_M
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size236.6 KiB
0
2647 
1
1528 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
02647
63.4%
11528
36.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
02647
63.4%
11528
36.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions


Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
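
The calculation described here can be sketched in plain Python: rank each variable, then divide the covariance of the ranks by the product of their standard deviations. This is a minimal illustration, not the implementation used to produce the report. The helper names (`average_ranks`, `pearson`, `spearman_rho`) are hypothetical, and giving tied values the mean of their positions is an assumption about tie handling.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson(x, y))
```

On this toy data ρ is 1 (up to floating-point rounding) because the relationship is perfectly monotonic, while Pearson's r is lower because the relationship is not linear.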


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the slope does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
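
As a rough illustration of this formula and of the invariance property, the sketch below computes r with the standard library only and then shifts and rescales both variables. The function name `pearson_r` and the sample data are made up for the example.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / (sx * sy)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately with positive scale factors.
r_transformed = pearson_r([3 * a + 7 for a in x], [0.5 * b - 2 for b in y])
# Flipping the sign of one variable flips the sign of r.
r_flipped = pearson_r(x, [-b for b in y])

print(r_original, r_transformed, r_flipped)
```

The first two values agree up to floating-point rounding, and the third is the negative of the first.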


Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
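
The pair-counting definition above corresponds to the variant often called τ-a. A minimal sketch follows, with a hypothetical function name; tie-adjusted variants such as τ-b, which a given library may use by default, will give different values when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # (2 - 1) / 3
```

This brute-force version visits every pair, so it runs in quadratic time; that is fine for illustration but slow for thousands of rows.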


Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
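
A bias-corrected estimate can be sketched from a contingency table of counts as follows. This is an illustration under assumptions: the function name is hypothetical, and the correction terms follow the commonly cited form of Bergsma's (2013) proposal, which should be checked against the paper before being relied on.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed form of the correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent counts
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated counts
```

The first table has identical rows, so the statistic is 0; the second places every observation on the diagonal, so the result is 1 up to floating-point rounding.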

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

   Sex  Length  Diameter  Height  Whole weight  Shucked weight  Viscera weight  Shell weight  Rings
0  M    0.455   0.365     0.095   0.5140        0.2245          0.1010          0.150         15
1  M    0.350   0.265     0.090   0.2255        0.0995          0.0485          0.070         7
2  F    0.530   0.420     0.135   0.6770        0.2565          0.1415          0.210         9
3  M    0.440   0.365     0.125   0.5160        0.2155          0.1140          0.155         10
4  I    0.330   0.255     0.080   0.2050        0.0895          0.0395          0.055         7
5  I    0.425   0.300     0.095   0.3515        0.1410          0.0775          0.120         8
6  F    0.530   0.415     0.150   0.7775        0.2370          0.1415          0.330         20
7  F    0.545   0.425     0.125   0.7680        0.2940          0.1495          0.260         16
8  M    0.475   0.370     0.125   0.5095        0.2165          0.1125          0.165         9
9  F    0.550   0.440     0.150   0.8945        0.3145          0.1510          0.320         19

Last rows

      Sex  Length  Diameter  Height  Whole weight  Shucked weight  Viscera weight  Shell weight  Rings
4167  M    0.500   0.380     0.125   0.5770        0.2690          0.1265          0.1535        9
4168  F    0.515   0.400     0.125   0.6150        0.2865          0.1230          0.1765        8
4169  M    0.520   0.385     0.165   0.7910        0.3750          0.1800          0.1815        10
4170  M    0.550   0.430     0.130   0.8395        0.3155          0.1955          0.2405        10
4171  M    0.560   0.430     0.155   0.8675        0.4000          0.1720          0.2290        8
4172  F    0.565   0.450     0.165   0.8870        0.3700          0.2390          0.2490        11
4173  M    0.590   0.440     0.135   0.9660        0.4390          0.2145          0.2605        10
4174  M    0.600   0.475     0.205   1.1760        0.5255          0.2875          0.3080        9
4175  F    0.625   0.485     0.150   1.0945        0.5310          0.2610          0.2960        10
4176  M    0.710   0.555     0.195   1.9485        0.9455          0.3765          0.4950        12